Introduction

The  goal  of  this  project  is  the  creation  of  a  computer-based  prognostic  system  for  breast  cancer  that:  is 
significantly  more  accurate  than  the  TNM  staging  system,  predicts  survival  over  time  based  on  therapy, 
and  presents  its  predictions  in  a  manner  that  physicians  can  understand.  This  project  can  be  viewed  as 
consisting  of  three  components:  (1)  Data  analysis  and  prognostic  factor  evaluation,  (2)  developing  the 
prognostic  model,  and  (3)  implementing  a  clinically  useful  system,  i.e.,  breast  cancer  prognostic  factors, 
the  artificial  neural  network  statistical  model,  and  the  clinician  user  interface. 

The  first  year  of  research  was  characterized  by  work  on  the  artificial  neural  network  statistical  model  and 
related  statistical  models,  specifically  tasks  2.1.2  (artificial  neural  network  generating  survival  curves), 
2.1.3  (determining  the  accuracy  of  the  survival  curves),  1.03, 2.1.4  (comparing  the  accuracy  of  the 
artificial  neural  network  to  other  statistical  models),  2.2  (implementing  an  effective  solution  for  missing 
data  in  training  and  performance),  and  2.3  (dealing  with  censored  data). 

In  addition,  during  the  first  year  we  started  work  related  to  data  analysis  and  prognostic  factors  including 
1.02  and  1.08.1  (recurrence  as  an  endpoint),  1.04.2  (creating  a  taxonomy  of  prognostic  factors  in  breast 
cancer),  1.04.3  (writing  a  book  on  prognostic  factors  in  breast  cancer,  in  preparation),  1.06.3 
(determining  minimum  data  set  size),  1.11  (examining  physician  breast  cancer  survival  estimates).  We  also 
began  work  on  3.1  (the  code)  and  3.2  (the  physician  interface). 

Also during the first year we added three tasks: (1) a comparison of the two main American cancer data
bases, namely the Surveillance, Epidemiology, and End Results (SEER) and the National Cancer Data Base
(NCDB) data bases; (2) an examination of what to do when confronted with cases that are not lost
completely at random and with competing risks; and (3) computerization of the TNM staging system.

The  second  year  of  research  was  characterized  by  the  continuation  of  work  begun  in  the  first  year  and  by 
data  analysis  and  prognostic  factor  development,  specifically  tasks  1.01  and  2.1.1  (extending  the  survival 
endpoint  from  five  to  ten  years),  1.02  and  1.03  (see  first  year),  1.05  (the  identification  of  high  risk  node 
negative  women),  1.06  (clinical  trials),  and  1.07  (therapy).  Work  continued  on  2.1.2  (artificial  neural 
network  generating  survival  curves),  2.1.3  (creating  a  new  method  for  assessing  prediction  accuracy), 
1.03  and  2.1.4  (model  comparisons).  In  addition,  during  the  second  year  work  began  on  1.04.1  (new 
molecular-genetic  prognostic  factors).  Work  was  completed  on  the  computerization  of  the  TNM  staging 
system  and  the  comparison  of  the  two  national  cancer  data  bases. 

Detailed Report: by Task

Task  1.  Data  analysis  and  prognostic  factor  evaluation. 

1.01)  Extend  analysis  of  binary  survival  endpoint  to  10  year  survival. 

We  have  analyzed  SEER  10  year  survival  data.  We  found  that  the  predictors  collected  at  disease  discovery 
are  less  accurate  in  predicting  10  year  survival  than  5  year  survival. 

SEER 1977 - 1982 Breast Cancer Data: 10 year Survival
Prediction Accuracy, TNM Variables

PREDICTION MODEL             ACCURACY*    SPECIFICATIONS
TNM Stages                   .692         0, I, IIA, IIB, IIIA, IIIB, IV
Artificial Neural Network    .730§        3-5-1

* The receiver operating characteristic area.
§ p < .01

Five  year  survival  accuracy  for  the  TNM  staging  system  was  .720  and  for  the  artificial  neural  network, 
.784. 
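
The accuracy figure in these tables, the area under the receiver operating characteristic, can be read as the probability that a randomly chosen survivor receives a higher predicted survival score than a randomly chosen non-survivor. The following is a minimal sketch of that pairwise calculation, not the code used in the project; the function name `roc_area` and the six example patients are hypothetical.

```python
def roc_area(scores, outcomes):
    """Area under the ROC curve by pairwise comparison.

    scores:   predicted probability of survival for each patient.
    outcomes: 1 if the patient survived the period, 0 otherwise.
    """
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one survivor and one non-survivor")
    # Count survivor/non-survivor pairs ranked correctly; ties count half.
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Hypothetical predictions for six patients (illustrative values only).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
outcomes = [1, 1, 0, 1, 0, 0]
area = roc_area(scores, outcomes)
```

A value of .5 corresponds to chance ranking and 1.0 to perfect ranking, which is the scale on which the .692 and .730 figures above are reported.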

1.02)  Extend  the  analysis  to  recurrence  as  an  endpoint. 

Completed,  in  last  year's  report. 

1.03)  Comparison  of  prognostic  models. 

Completed,  in  last  year's  report. 

1.04.1)  New  prognostic  factors 

We have obtained from Duke University a data set that contains, in addition to the TNM variables, age,
estrogen and progesterone receptor status, histology, p53, and erbB-2. The accuracy results are shown
below. These new prognostic factors produce major increases in prognostic accuracy. This is a very
encouraging result.

Duke Breast Cancer Data: 5 year Survival Prediction Accuracy

PREDICTION MODEL                   ACCURACY*    SPECIFICATIONS
pTNM Stages                        .567
Stepwise Logistic Regression       .865         no interactions
Backpropagation Neural Network     .926         9-5-1

* The area under the curve of the receiver operating characteristic.


These results are based on slightly more than 100 cases, so the results must be interpreted cautiously. We
will be obtaining several hundred additional cases from Duke in the near future. We will use the additional
cases to confirm these results.
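
The network specification 9-5-1 is read here as nine inputs, five hidden units, and one output, which would match the three TNM variables plus the six additional factors listed above. The sketch below shows one forward pass through such a layout. It is illustrative only: the report does not state the activation function, so the logistic function is an assumption, and the weights and the all-zero patient vector are arbitrary placeholders rather than trained values or real data.

```python
import math
import random


def logistic(z):
    """Logistic activation (an assumption; not stated in the report)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass through an inputs-hidden-1 feedforward network."""
    hidden = [logistic(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return logistic(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


random.seed(0)
n_in, n_hidden = 9, 5  # the 9-5-1 layout: 9 inputs, 5 hidden units, 1 output
w_hidden = [[random.uniform(-1, 1) for _ in range(n_in)]
            for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [random.uniform(-1, 1) for _ in range(n_hidden)]
b_out = 0.0

x = [0.0] * n_in  # placeholder patient vector, not real data
p = forward(x, w_hidden, b_hidden, w_out, b_out)  # value between 0 and 1
```

Training by backpropagation would adjust `w_hidden`, `b_hidden`, `w_out`, and `b_out`; only the prediction step is shown here.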

1.04.2)  Create  a  taxonomy  of  prognostic  factors  in  breast  cancer. 

Completed,  in  last  year's  report. 

1.04.3) Prognostic Factors in Breast Cancer book.

The  book  is  being  prepared  for  publication  in  1997.  Burke  HB,  Henson  DE.  Prognostic  Factors  and 
Systems  in  Oncology.  Kluwer  Academic  Publishers  Inc.,  in  preparation. 

1.05)  High  risk  node  negative  women 

Based  on  our  initial  results  with  the  new  prognostic  factors  contained  in  the  Duke  data  set  (1.04.1)  we 
believe  that  some  of  these  factors  will  be  very  useful  in  identifying  high  risk  node  negative  women.  This 
work  is  ongoing. 

1.06.1,  1.06.2)  Clinical  trials 

The  increases  in  prognostic  accuracy  we  described  last  year  and  this  year  suggest  that  we  are  finding  more 
homogeneous  patient  populations.  The  Duke  data  set  contains  treatment  information  and  we  are  currently 
using  it  to  discover  which  new  prognostic  factors  predict  response  to  specific  therapies.  This  work  is 
ongoing. 

1.06.3)  Determining  minimum  data  set  size. 

Completed,  in  last  year's  report. 

1.08.1)  Recurrence  analysis. 

Completed,  in  last  year's  report. 

1.11)  Patient  information  and  physician  credibility. 

Completed,  in  last  year's  report. 

Task  2.  Developing  the  prognostic  model. 

2.1.1)  Generate  survival  curves  for  10  year  data. 

Refer  to  1.01  for  details. 

2.1.2)  Generating  survival  curves. 

Completed,  in  last  year's  report. 

2.1.3)  Determining  the  accuracy  of  the  survival  curves. 

We  are  currently  working  on  a  new  measure  of  prediction  accuracy  that  we  call  "A".  It  includes  the  area 
under  the  receiver  operating  characteristic  as  a  special  case.  This  work  is  ongoing.  We  have  implemented 
several  accuracy  methods  and  derived  the  asymptotic  variance  for  each  in  order  to  assess  the  adequacy  of 
each  method.  (See  Attachment). 

2.1.4)  Comparison  of  artificial  neural  networks  with  Cox  proportional  hazards  model. 

We began our comparison with the Cox model by examining whether breast cancer violates the model's
proportional hazards assumption. Proportional hazards methods include the Cox model (Cox, 1972) and, less
commonly, the Weibull or exponential distributions (Evans, 1993). Proportional hazards methods assume
that the hazard of each patient is proportional to the hazards of all the other patients and that an individual
patient's hazard is related to that patient's relative risk. The Cox model does not create survival curves. For
Cox-related survival curves a baseline hazard must be introduced (Breslow-Cox estimates; Breslow, 1974).
Some researchers incorrectly believe that only regression methods that assume proportional hazards can
deal with censoring, but a multiinterval regression model that drops patients during the interval in which
they are censored is capable of dealing with censoring. It is always vital to test the proportional hazards
assumption when using a regression method that relies on it. There are several methods for assessing
proportional hazards violation, including Schoenfeld's partial residuals (Schoenfeld, 1982) and the log
hazard ratio as a function of time (Gore, 1986). We have created a method somewhat similar to Gore's. We
construct a Cox model, divide the time into sub-intervals, and assess the accuracy of the model for each
sub-interval. If proportional hazards holds, accuracy should be constant across sub-intervals. Results for
breast cancer are shown below.

TABLE. Area under the receiver operating characteristic (Az) for two Cox models; breast cancer (five
one-year intervals) N = 1,222 and melanoma (three six-month intervals) N = 60.

Model/Interval    1              2              3              4              5
Breast (SE)       .734 (.057)    .735 (.036)    .758 (.038)    .773 (.040)    .693 (.041)

Area under the curve for Cox model evaluated at five time intervals


Because  the  Az  values  are  not  constant  across  the  sub-intervals,  proportional  hazards  does  not  hold  for 
breast  cancer. 
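
The sub-interval check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project code: the function names, the interval convention (start, end], and the six example patients are hypothetical, and the risk score stands in for the linear predictor of a Cox model that has already been fitted.

```python
def roc_area(scores, labels):
    """Probability that a case labelled 1 outscores a case labelled 0."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return None  # undefined when an interval holds only one class
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def interval_areas(times, died, risk, edges):
    """ROC area of one fixed risk score within each sub-interval.

    times: follow-up time per patient; died: 1 = death, 0 = censored.
    risk:  model risk score (higher means worse prognosis).
    edges: interval boundaries, e.g. [0, 1, 2, 3, 4, 5] in years.
    """
    areas = []
    for start, end in zip(edges[:-1], edges[1:]):
        scores, labels = [], []
        for t, d, r in zip(times, died, risk):
            if t <= start:
                continue  # no longer at risk when the interval opens
            if t <= end and d == 0:
                continue  # censored inside the interval: dropped
            scores.append(r)
            labels.append(1 if t <= end else 0)  # 1 = died in interval
        areas.append(roc_area(scores, labels))
    return areas


# Hypothetical follow-up data for six patients (illustrative only).
times = [0.5, 0.8, 1.5, 1.7, 2.5, 3.0]
died = [1, 0, 1, 1, 0, 0]
risk = [0.9, 0.3, 0.8, 0.2, 0.4, 0.1]
areas = interval_areas(times, died, risk, [0, 1, 2])
```

Each entry of `areas` corresponds to one column of the table above. Whether the variation between intervals exceeds sampling error would still have to be judged against the reported standard errors.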

Artificial neural networks are a general regression method that does not assume proportional hazards and
that can capture nonlinearity and complex interactions (Burke, 1994, 1995a). Multiinterval artificial neural
networks can handle censoring in the same way that multiinterval logistic regression models handle
censoring. It seems clear that proportional hazards is probably not appropriate for breast cancer or lung
cancer (results not reported).
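
The multiinterval treatment of censoring can be illustrated with a short sketch. It builds one training set per interval, dropping each censored patient from the interval in which censoring occurs, and then chains per-interval predictions into a survival curve. The function names and example values are hypothetical, and chaining conditional probabilities by multiplication is an assumption made for illustration; the report does not state whether the network outputs conditional or cumulative probabilities.

```python
def interval_training_sets(times, died, features, edges):
    """Build one (inputs, targets) training set per interval.

    A patient enters an interval's set only if still at risk when the
    interval opens, and is dropped from the interval in which censoring
    occurs. Target is 1 if the patient survived the interval, 0 if not.
    """
    sets = []
    for start, end in zip(edges[:-1], edges[1:]):
        inputs, targets = [], []
        for t, d, x in zip(times, died, features):
            if t <= start:
                continue  # died or was censored in an earlier interval
            if t <= end and d == 0:
                continue  # censored during this interval
            inputs.append(x)
            targets.append(0 if t <= end else 1)
        sets.append((inputs, targets))
    return sets


def survival_curve(conditional_survival):
    """Chain per-interval conditional survival probabilities into a curve."""
    curve = [1.0]
    for p in conditional_survival:
        curve.append(curve[-1] * p)
    return curve


# Hypothetical data: four patients, one feature each, two one-year intervals.
sets = interval_training_sets(
    times=[0.5, 0.8, 1.5, 2.5],
    died=[1, 0, 1, 0],
    features=[[1], [2], [3], [4]],
    edges=[0, 1, 2],
)
curve = survival_curve([0.5, 0.5])
```

The patient censored at 0.8 years contributes to no training set here, while the patient censored at 2.5 years contributes a survival target to both intervals, so partial follow-up is used rather than discarded.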

2.2)  Missing  data  (see  discussion  in  last  year's  report  also) 

We  have  completed  work  on  an  efficient  missing  data  mechanism.  It  is  available  for  use  by  interested 
researchers. 

Rosen DB, Burke HB. Applying a gaussian-bernoulli mixture model network to binary and continuous
missing data in medicine. Submitted for publication.

Burke  HB  (ed).  Missing  Data:  Advanced  Methods  and  Models.  M.I.T.  Press,  in  discussion. 

2.3) Censored cases

Discussed  in  detail  in  last  year's  report  (2.1.2). 

Task  3.  Implementation  of  a  clinically  useful  prognostic  system 

3.1)  Computer  code  for  artificial  neural  networks. 

All  our  work  is  written  in  either  C,  C++,  or  XLISP-STAT. 

3.2)  Physician  interface. 

It is very important that physicians find the new prognostic system easy to use and useful. To this end we
have implemented the prognostic system on a DOS platform with a Windows interface. We are presenting
the system to clinicians and receiving feedback regarding what is important to them in terms of information
and the graphical display of the information (see 2.1.2). See Attachment for screen output.

Tasks  added  to  the  project 

(1)  The  computerization  of  the  TNM  staging  system  for  breast  cancer  has  been  completed  and  has  been 
integrated  into  the  prognostic  system. 
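
A computerized staging rule reduces to a lookup from the T, N, and M categories to a stage group. The sketch below is illustrative only and is not the code integrated into the prognostic system: the function name `tnm_stage` is hypothetical, and the grouping follows a reading of the breast cancer stage groups in use at the time (stages 0 through IV, with no IIIC group), so it should be checked against the staging manual before any use.

```python
def tnm_stage(t, n, m):
    """Map T, N, M categories to a breast cancer stage group.

    t: one of "Tis", "T0", "T1", "T2", "T3", "T4"
    n: one of "N0", "N1", "N2", "N3"
    m: "M0" or "M1"
    Returns None for combinations outside the assumed grouping.
    """
    if m == "M1":
        return "IV"      # any T, any N, M1
    if t == "T4" or n == "N3":
        return "IIIB"    # T4 with any N, or any T with N3
    if t == "Tis":
        return "0" if n == "N0" else None
    table = {
        ("T1", "N0"): "I",
        ("T0", "N1"): "IIA", ("T1", "N1"): "IIA", ("T2", "N0"): "IIA",
        ("T2", "N1"): "IIB", ("T3", "N0"): "IIB",
        ("T0", "N2"): "IIIA", ("T1", "N2"): "IIIA", ("T2", "N2"): "IIIA",
        ("T3", "N1"): "IIIA", ("T3", "N2"): "IIIA",
    }
    return table.get((t, n))


stage = tnm_stage("T3", "N1", "M0")
```

The seven groups returned by this lookup are the same seven listed in the specifications column of the SEER accuracy table in 1.01.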

(2)  Comparison  of  the  NCDB  and  the  SEER  data  sets  in  terms  of  breast  cancer. 

We have completed our comparison of the two national breast cancer data bases, the National Cancer Data
Base (NCDB) and its associated Patient Care Evaluation (PCE), and the Surveillance, Epidemiology, and
End Results (SEER) data sets. We evaluated them in terms of: (i) representativeness: is the data set an
unbiased representation of the breast cancer population; (ii) incidence/prevalence: how well does the data
set capture the incidence and prevalence of breast cancer; and (iii) prognosis/outcome: how well does the
data set provide information that is useful for predicting outcome. An overview of the results is shown
below.


                        SEER              NCDB
Representativeness      good              good
Incidence/prevalence    good              adequate
Prognosis/outcome       not acceptable    adequate

Both are representative of the breast cancer population. SEER does a better job at incidence and prevalence
because it ascertains all cases in a catchment area, regardless of whether the hospital belongs to the
American College of Surgeons accreditation program, and it contains relatively little missing data. NCDB
contains a great deal of missing data. SEER cannot be used for prognosis because it does not provide
therapy data, owing to the unreliability of those data. NCDB suffers from a lack of follow-up, resulting in
high censoring rates.

Burke HB, Hoang A, Visintainer P. Comparison of the two national cancer data sets: SEER and
NCDB. In preparation.


DAMD17-94-J-4383  9 


Conclusions 

The second year of research was characterized by the continuation of work on the artificial neural network
statistical model and related statistical models and by data analysis and prognostic factor development. In
summary, the research is going well and we are ahead of schedule. We feel that we will be able to meet our
goal of providing, within the four year time frame of this grant, a computer-based prognostic system that is
more accurate than the TNM staging system and that is easy to use and understand. In addition, we have
created several new systems that we believe will advance the domain of cancer prognosis, e.g., artificial
neural network survival-over-multi-interval-time models, an effective missing data method for training and
performance, and a new approach to the assessment of prediction accuracy.

Burke HB, Goodman PH, Rosen DB. A computerized prediction system for cancer patient survival that
uses an artificial neural network. Proceedings of the First World Congress on Computational
Medicine and Public Health 1995, in press.

Burke  HB.  Artificial  neural  networks  and  biomedicine.  Proceedings  of  the  Workshop  on  Environmental 
and  Energy  Applications  of  Neural  Networks  1995,  in  press. 

Burke HB. The importance of artificial neural networks in biomedicine. Proceedings of the World
Congress on Neural Networks. Hillsdale, NJ: Lawrence Erlbaum Assoc. Inc. 1995, 725-30.

Burke  HB,  Hoang  A,  Rosen  DB.  Survival  function  estimates  in  cancer  using  artificial  neural  networks. 
Proceedings  of  the  World  Congress  on  Neural  Networks.  Hillsdale,  NJ:  Lawrence  Erlbaum 
Assoc.  Inc.  1995,  742-7. 

Burke  HB,  Goodman  PH,  Rosen  DB,  Henson  DE,  Weinstein  JN,  Hellier  JH,  Harrell  Jr.  FE,  Marks  JR, 
Winchester  DP,  Bostwick  DG.  Artificial  neural  networks  improve  cancer  survival  prediction 
accuracy.  Cancer,  in  press. 

Rosen  DB,  Burke  HB,  Goodman  PH.  Improving  prediction  accuracy  using  a  calibration  postprocessor. 
Proceedings  of  the  World  Congress  on  Neural  Networks.  Hillsdale,  NJ:  Lawrence  Erlbaum 
Assoc. Inc. 1996, in press.

Burke  HB.  Measuring  classification/prediction  accuracy.  Proceedings  of  the  World  Congress  on  Neural 
Networks.  Hillsdale,  NJ:  Lawrence  Erlbaum  Assoc.  Inc.  1996,  in  press. 

Rosen DB, Burke HB. Applying a gaussian-bernoulli mixture model network to binary and continuous
missing data in medicine. Submitted for publication.

Burke  HB.  The  TNM  staging  system.  In  preparation. 

Hoang  A,  Burke  HB.  M:  A  measure  of  discriminative  accuracy.  In  preparation. 

Burke  HB,  Henson  DE,  Bostwick  DG,  Hoang  A.  Backward  and  forward  relationships  between  prior 
events  and  disease.  In  preparation. 

Hoang A, Burke HB. Methods for dealing with cases nonrandomly lost-to-follow-up. In preparation.

Burke HB, Hoang A. Assessment of multi-interval event models. In preparation.

Burke  HB,  Hoang  A,  Visintainer  P.  Comparison  of  the  two  national  cancer  data  sets:  SEER  and  NCDB. 
In  preparation. 


Invited  papers 

Burke  HB.  Defining  the  computer-aided  diagnostic  device  domain.  Proceedings  of  the  World  Congress  on 
Neural  Networks.  Hillsdale,  NJ:  Lawrence  Erlbaum  Assoc.  Inc.  1996,  in  press. 


Presented  papers 

Burke  HB.  Outcome  prediction  in  cancer.  1995  International  Conference  on  Health  Policy  Research: 
Methodologic  Issues  in  Health  Services  and  Outcome  Research,  Harvard  Medical  School, 
Boston  MA,  December  2-3, 1995. 

Burke  HB.  Defining  the  Computer-aided  Diagnostic  Device  Domain.  U.S.  Food  and  Drug  Administration 
Computer-aided  Diagnostic  Device  Workshop,  Rockville  MD,  January  26, 1996. 

Burke  HB.  Computer-based  Clinical  Decision  Support  System  in  Oncology.  The  Eli  Lilly  Lecture, 
Conference  on  Prognostic  Factors  and  Rational  Treatment  of  Cancer,  Yorkshire  Cancer 
Organization,  Leeds  UK,  July  3,  1996. 

Burke HB. Issues in the regulation of medical software devices. Plenary session, FDA/NLM Software
Policy Workshop, National Institutes of Health, Bethesda MD, September 3-4, 1996.

Rosen DB, Burke HB, Goodman PH. Improving prediction accuracy using a calibration postprocessor.
World Congress on Neural Networks, San Diego CA, September 15-20, 1996.

Burke  HB.  Measuring  classification/prediction  accuracy.  World  Congress  on  Neural  Networks,  San 
Diego  CA,  September  15-20, 1996. 

Burke  HB.  Defining  the  computer-aided  diagnostic  device  domain.  World  Congress  on  Neural  Networks, 
San  Diego  CA,  September  15  -  20, 1996. 

Conference  positions 

1995  Scientific  Advisor.  Cancer  Prognostic  Factors  Conference,  Cambridge  Healthtech  Institute, 
Arlington,  VA,  June  7-8, 1995. 

1995  Program  Committee  and  Co-chair.  Biomedical  Applications  of  Neural  Networks  section.  World 
Congress  on  Neural  Networks  and  1995  International  Neural  Network  Society  Annual  Meeting, 
Washington  DC,  July  17-21, 1995. 

1995 Workshop Chair, Missing Data: Methods and Models. Neural Information Processing Systems
1995, Vail CO, December 1, 1995.

1996  Program  Committee  and  Co-chair,  Biomedical  Applications  Section,  World  Congress  on 
Neural  Networks  and  1996  International  Neural  Network  Society  Annual  Meeting,  San  Diego 
CA,  September  15-20,  1996. 

1996 Workshop Chair, Model Accuracy: Issues and Methods. Neural Information Processing Systems
1996, Vail CO, December 2, 1996.

1996  Co-organizer,  p53:  A  Prognostic  Factor  in  Cancer.  National  Cancer  Institute,  in  preparation  for 
Spring,  1997,  Bethesda,  MD. 


[Attachment (see 2.1.3): equations for the accuracy measures and their asymptotic variances; not legible in the source.]


Attachment: screen output of the prognostic system (see 3.2)

Input screen fields:

Patient Name
Patient ID
Institution
Date
Physician
Cancer Type: Breast
Prediction Type: 5 Year Survival Curve

Tumor
Tumor Size
Lymph Nodes Positive
Lymph Nodes Examined
Distant Metastasis
Estrogen Receptor
Progesterone Receptor
Menopausal Status
Grade
Age
Lymph Nodes pTNM

Output screen fields:

Patient Name, Patient ID, Institution, Date, Physician, Cancer Type, Prediction Type
Factors: Tumor, T Size, LN Pos, LN Exam, Mets, ER, PR, Menopausal, Grade, Age, LN pTNM

Year    ANN Prediction
0       1.000
1       0.983
2       0.898
3       0.82
4       0.775
5       0.721

[Plot: ANN predicted survival curve; vertical axis Probability (0.0 to 1.0), horizontal axis Year.]

TNM Stage: IIIA
Prediction: 0.595


